Soren: Adaptive MapReduce for Programmable GPUs
Authors
Abstract
In recent years, the MapReduce programming model has been widely used for developing parallel data-intensive applications. As a result of its popularity, many implementations of the model exist on different parallel architectures, including massively parallel programmable GPUs. A basic challenge in implementing a MapReduce runtime system is the wide diversity of applications built on the model: a fixed implementation of the runtime may be suboptimal for some classes of applications. In this paper, we propose an adaptive framework for MapReduce on GPUs that monitors key characteristics of applications and dynamically executes each one in whichever of the three variations of the MapReduce engine it implements is most efficient. Our preliminary results show that this adaptive method can significantly improve performance for many MapReduce applications (including an 11x speedup in one case) compared to a state-of-the-art MapReduce implementation on GPUs.
Similar resources
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. The current Hadoop implementation and the rack-aware data placement strategy of the Hadoop distributed file system assume a homogeneous cluster, in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...
An Adaptive Framework for Managing Heterogeneous Many-Core Clusters
The computing needs and the input and result datasets of modern scientific and enterprise applications are growing exponentially. To support such applications, High-Performance Computing (HPC) systems need to employ thousands of cores and innovative data management. At the same time, an emerging trend in designing HPC systems is to leverage specialized asymmetric multicores, such as IBM Cell an...
Co-processing SPMD Computation on GPUs and CPUs on Shared Memory System
Heterogeneous parallel systems with multiple processors and accelerators are becoming ubiquitous due to better cost-performance and energy efficiency. These heterogeneous processor architectures have different instruction sets and are optimized for either task latency or throughput. Challenges occur in regard to programmability and performance when executing SPMD computations on heterogene...
Accelerating Mahout on Heterogeneous Clusters Using HadoopCL
MapReduce is a programming model capable of processing massive data in parallel across hundreds of computing nodes in a cluster. It hides many of the complicated details of parallel computing and provides a straightforward interface for programmers to adapt their algorithms to improve productivity. Many MapReduce-based applications have utilized the power of this model, including machine learni...
A Map Reduce Framework for Programming Graphics Processors
Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled high performance general purpose computation. We describe a framework designed for high performance GPU programming, built on Nvidia’s Compute Unified Device Architecture (CUDA) platform. The framework is built around the Map Reduce abstraction, which allows application developers to focus on thei...